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ABSTRACT 


Monte  Carlo  Methods  are  introduced  and  used  to  estimate  false  alarm  proba¬ 
bilities.  The  estimation  of  the  latter  is  important  in  the  context  of  performance 
analysis  of  Constant  False  Alarm  Rate  (CFAR)  radar  detection  processes.  A 
CFAR  detector  estimates  the  clutter  level,  producing  a  threshold,  and  a  target 
is  declared  present  if  the  statistic  representing  the  test  observation  exceeds  this 
threshold.  The  latter  is  adjusted  adaptively,  so  that  the  rate  of  false  alarms  is 
held  constant.  Hence,  in  a  radar  analysis  context,  the  performance  of  a  CFAR 
process  can  be  determined  from  whether  it  maintains  a  constant  false  alarm 
rate.  In  order  to  compare  the  performance  of  a  number  of  different  CFAR 
schemes,  in  a  common  clutter  environment,  we  need  to  estimate  these  false 
alarm  probabilities.  This  can  be  done  quite  easily  using  a  basic  Monte  Carlo 
estimator.  However,  the  latter  may  require  a  very  large  number  of  iterations  in 
order  to  produce  a  reasonable  estimate.  To  reduce  this  number  of  iterations, 
importance  sampling  techniques  can  be  used.  To  illustrate  these  techniques, 
we  consider  the  simple  case  of  cell  averaging  CFAR  in  a  Gaussian  environ¬ 
ment,  with  square  law  detection.  This  enables  comparison  of  estimators  with 
an  exact  result. 
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Estimation  of  False  Alarm  Probabilities  in  Cell  Averaging 
Constant  False  Alarm  Rate  Detectors  via  Monte  Carlo 

Methods 


EXECUTIVE  SUMMARY 


Constant  False  Alarm  Rate  (CFAR)  detectors  use  an  adaptive  threshold  to  compare  the 
estimated  clutter  level  with  a  test  statistic,  the  latter  representing  the  cell  under  test. 
When  this  test  statistic  exceeds  the  threshold,  a  target  is  declared  present.  The  adaptive 
threshold  is  adjusted  so  that  the  probability  of  false  alarm,  that  is,  the  probability  of 
declaring  a  target  present  when  there  is  actually  no  target,  remains  fixed.  Performance 
analysis  of  various  CFAR  schemes  focuses  on  whether  the  sample  false  alarm  rate  remains 
constant.  Thus,  in  order  to  compare  CFAR  schemes  in  various  clutter  environments,  a 
comparison  of  the  false  alarm  probabilities  is  required. 

Estimation  of  false  alarm  probabilities,  and  definite  integrals  in  general,  can  be  performed 
using  a  class  of  techniques  known  as  Monte  Carlo  Methods.  A  key  issue  with  such  tech¬ 
niques  is  that  the  estimators  based  upon  them  may  require  a  large  number  of  iterations 
to  produce  a  good  estimate.  In  order  to  improve  this  situation,  by  reducing  the  number 
of  iterations,  Importance  Sampling  (IS)  can  be  used. 

This  report  will  examine  the  estimation  of  false  alarm  probabilities  for  performance  analy¬ 
sis  of  CFAR  detection  processes,  using  Monte  Carlo  methods.  In  particular,  we  will  be 
interested  in  the  performance  of  a  number  of  importance  sampling  estimators.  A  key  is¬ 
sue  to  be  considered  is  whether  the  number  of  iterations  needed  to  produce  a  reasonable 
estimate  can  be  significantly  reduced.  Attention  will  be  restricted  to  the  case  of  a  cell¬ 
averaging  (CA)  CFAR,  in  a  Gaussian  clutter  environment.  This  enables  comparison  of 
estimated  results  to  precise  probabilities  of  false  alarm. 

This  report  begins  with  an  outline  of  CA-CFAR  techniques,  and  Monte  Carlo  Methods, 
followed  by  a  derivation  of  an  exact  analytic  relationship  between  false  alarm  probabilities 
and  thresholds,  in  the  Gaussian  clutter  environment.  Monte  Carlo  IS  techniques  are  then 
introduced  and  used  to  estimate  false  alarm  probabilities.  Finally,  simulation  studies  are 
used  to  demonstrate  the  power  of  the  Monte  Carlo  IS  approach. 
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1  Introduction 

1.1  Monte  Carlo  Methods 


Ill  a  realistic  mathematical  model  of  any  real  world  phenomenon,  it  can  be  expected  that 
the  model  complexity  will  often  translate  to  difficulty  in  applying  it  to  applications  of 
interest.  As  an  example,  in  the  context  of  [Tonkin  and  Dolman  1990],  where  models 
are  constructed  for  the  radar  signal  return  from  a  submarine  periscope,  parameters  in  the 
model  may  be  difficult  to  estimate  precisely.  A  sample  of  observations  can  be  used,  together 
with  an  appropriate  estimator,  to  obtain  an  approximation  to  the  unknown  parameters. 
Due  to  their  statistical  nature,  the  estimator  of  a  parameter  will  have  a  level  of  variance, 
and  it  is  thus  desirable  to  minimise  this,  in  order  to  achieve  a  good  approximation. 

In  many  cases,  these  estimators  are  improved  as  the  sample  size  increases.  However,  this 
can  present  a  number  of  technical,  as  well  as  practical,  problems.  As  an  example  of  this, 
from  a  practical  perspective,  it  may  be  difficult  to  obtain  a  large  sequence  of  accurate 
observations.  Consequently,  an  estimator  may  not  be  able  to  provide  an  acceptable  level 
of  accuracy. 

The  techniques  known  as  Monte  Carlo  Methods  provide  a  systematic  way  of  estimating 
unknown  parameters  via  simulation.  Based  upon  the  Law  of  Large  Numbers,  sample 
averages  are  used  to  estimate  unknown  parameters.  The  power  of  the  method  comes 
from  the  fact  that  it  permits  the  estimation  of  integrals,  and  consequently  can  be  used 
to  approximate  statistical  expectations,  and  hence  probabilities.  The  basic  Monte  Carlo 
techniques  have  existed  for  centuries,  but  the  systematic  development  began  during  the 
1940s,  with  the  development  of  the  atomic  bomb.  The  name  was  coined  by  Metropolis, 
in  the  course  of  the  Manhattan  Project  of  the  Second  World  War,  due  to  the  similarity 
between  statistical  simulation  and  games  of  chance.  In  1948  Monte  Carlo  simulations 
were  used  by  Fermi,  Metropolis  and  Ulam  to  obtain  estimates  for  the  eigenvalues  of  the 
Schrodinger  Equation. 

The  focus  here  will  be  on  the  idea  of  improving  Monte  Carlo  simulations  via  a  procedure 
known  as  Importance  Sampling  (IS),  which  from  a  mathematical  perspective,  involves  a 
change  in  the  underlying  probability  distribution  associated  with  the  estimator. 

From  a  practical  point  of  view,  the  idea  is  that  certain  observations  will  have  more  relevance 
to  the  estimation  than  others,  and  hence  these  ‘important  observations”  are  emphasized. 
Additionally,  since  a  new  estimator  is  constructed,  which  focuses  on  only  a  subset  of 
the  sample  points,  the  simulation  run  time  should  be  reduced.  In  order  to  unbias  this 
estimator,  it  must  therefore  be  weighted  to  compensate  for  the  reduction  in  sample  size. 
It  is  also  an  objective  of  IS  to  reduce  the  variance  of  the  estimator,  and  so  IS  methods  are 
often  referred  to  as  variance  reduction  techniques. 


As  reported  in  [Smith,  Sliafi  and  Gao  1997],  Importance  Sampling  (IS)  began  during  the 
1940s  with  the  ideas  of  von  Neumann  applied  in  the  context  of  particle  splitting  and 
Russian  roulette.  Prior  to  1970  the  main  development  of  IS  was  in  the  context  of  nuclear 
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physics,  with  the  work  of  [Kahn  1950,  1956]  and  [Kalin  and  Mann  1957].  Applications 
in  the  communications  modelling  context  began  to  appear  in  the  literature  during  the 
1970s,  with  the  work  of  [Balaban  1976].  An  early  application  to  the  subject  of  false 
alarm  probability  estimation  appears  in  [Mitchell  1981].  The  latter  application  of  IS 
techniques  will  be  the  main  practical  focus  of  this  report.  [Shanmugam  and  Balaban  1980] 
is  attributed  with  popularising  the  use  of  IS  in  bit  error  rate  estimations.  [Cottrell,  Fort 
and  Malgouyres  1983]  introduced  the  idea  of  using  large  deviation  theory  to  construct  an 
IS  simulation  of  rare  events.  This  resulted  in  a  systematic  development  of  the  method, 
with  applications  to  communications  problems,  with  work  such  as  [Sadowsky  and  Bucklew 
1990].  Later  developments  include  investigations  of  suboptimal  biasing  densities,  as  in 
[Orsak  and  Aazhang  1989]  and  [Gerlach  1999],  as  well  as  a  number  of  different  other 
techniques,  which  will  be  subsequently  considered. 

In  IS,  a  change  of  probability  measure  is  performed  through  a  biasing  distribution.  The 
choice  of  an  appropriate  biasing  distribution  is  the  main  problem  in  the  design  of  an 
effective  and  efficient  Monte  Carlo  (MC)  simulator.  The  “art”  of  good  MC  simulation 
via  IS  is  to  choose  an  appropriate  biasing  density  for  the  problem  under  consideration.  A 
large  percentage  of  the  IS  literature  after  1987  focuses  on  the  selection  of  biasing  densities, 
such  as  [Orsak  1993]  and  [Orsak  and  Aazhang  1989,  1991,  1993].  As  will  be  observed, 
in  cases  of  interest  there  exists  an  optimal  IS  biasing  density,  which  is  not  suitable  for 
use  in  practice  because  it  depends  on  the  parameter  being  estimated.  This  has  resulted 
in  designing  suboptimal  biasing  densities,  based  on  this  ideal  form.  In  many  cases  this 
approach  has  generated  improved  MC  simulations. 

As  remarked  previously,  a  biasing  distribution  is  used  to  emphasize  regions  of  importance 
to  the  estimation  process.  Modification  of  an  unbiased  estimator,  by  reducing  its  input 
sample  size,  will  result  in  a  biased  estimate.  The  resulting  estimator  is  then  unbiased 
through  pointwise  weightings.  The  correct  choice  for  the  weighting  function,  to  produce 
an  unbiased  estimator,  turns  out  to  be  the  Radon-Nikodym  derivative  of  the  original 
distribution,  with  respect  to  the  biasing  distribution.  Equivalently,  when  densities  exist, 
this  reduces  to  a  ratio  of  the  original  density  and  the  biasing  density. 

As  pointed  out  in  [Smith,  Shah  and  Gao  1997],  IS  is  not  without  its  limitations.  A  key 
problem  is  designing  suitable  biasing  densities  as  system  complexities  increase.  It  may  in 
fact  be  very  difficult  to  apply  the  method  successfully  to  complex  systems,  as  illustrated 
in  [Hopmans  and  Kleijnen  1979].  However,  this  does  not  mean  IS  is  completely  useless  in 
complex  system  modelling.  An  example  of  IS  applied  to  a  complex  system,  in  the  setting 
of  simulating  Viterbi  decoders,  is  [Sadowsky  1990].  The  relevant  point  to  keep  in  mind 
is  that  the  method  is  system  dependent,  and  there  may  be  successes  or  failures  in  its 
application,  depending  on  the  specific  system  under  consideration. 

This  study  will  focus  on  IS  applications  to  the  estimation  of  false  alarm  probabilities  in 
a  signal  detection  context.  The  interest  in  estimating  such  probabilities  is  to  facilitate 
the  performance  analysis  of  radar  detection  schemes.  In  particular,  we  are  interested  in 
the  analysis  of  Constant  False  Alarm  Rate  (CFAR)  detection  schemes.  Such  a  detector 
compares  a  test  observation  with  an  adaptive  threshold,  the  latter  produced  so  that  the 
rate  of  false  alarms  remains  constant.  A  false  alarm  is  the  declaring  of  a  target  present 
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when  in  fact  there  is  only  clutter  and  noise  present.  Two  different  CFAR  schemes  can 
be  compared  by  looking  at  the  rate  of  false  alarms,  in  a  common  clutter  environment.  In 
this  case,  we  need  to  control  the  clutter  environment,  so  that  a  valid  comparison  can  be 
performed.  We  will  be  interested  in  the  performance  of  estimators  of  the  probability  of 
false  alarm.  False  alarm  probabilities  are  typically  set  to  small  numbers,  so  that  Monte 
Carlo  estimators  may  require  a  very  large  number  of  iterations  to  produce  a  reasonable 
estimate.  We  will  be  interested  in  examining  the  possibility  of  significantly  improving  this 
drawback,  using  IS. 

We  will  restrict  our  attention  to  a  Gaussian  clutter  and  noise  model.  An  advantage  in 
doing  this  is  that  the  exact  probability  of  false  alarm  is  available,  providing  a  gauge  to 
measure  the  performance  of  estimators. 

The  IS  techniques  considered  here  can  be,  and  have  been,  applied  to  more  general  CFAR 
schemes,  with  non-Gaussian  clutter  models.  See  [Srinivasan  2000,  2001]  for  examples  of 
this.  In  this  report,  we  investigate  a  very  simple  CFAR  scheme  to  compare  these  IS 
estimators. 

We  now  briefly  introduce  CA-CFAR  schemes,  followed  by  an  introduction  to  Monte  Carlo 
techniques. 


1.2  CA-CFAR  Context 


A  CFAR  processor  is  a  signal  processing  tool  which  enables  the  automatic  detection  of  tar¬ 
gets  in  clutter  and  noise.  It  uses  an  adaptive  threshold,  where  targets  are  declared  present 
when  a  detection  statistic  exceeds  this  threshold  value.  This  threshold  is  determined  so 
that  the  rate  of  false  alarms  is  held  constant. 

For  a  specified  level  of  false  alarm  probability  and  noise  level,  a  radar  system  must  increase 
its  transmitted  power  in  order  to  increase  the  probability  of  target  detection  at  a  prescribed 
range.  Thus  we  have  an  optimisation  problem  where  we  seek  to  maximise  the  detection 
probability  subject  to  a  false  alarm  probability  constraint.  In  terms  of  producing  optimum 
detection  criteria,  this  suggests  the  usage  of  a  Ney man- Pearson  test,  or  Bayes  decision 
strategy. 

The  Neyman-Pearson  Theorem  provides  a  mechanism  for  determining  the  form  of  the 
uniformly  most  powerful  test  between  two  statistical  hypotheses.  The  following  result  is 
from  [Beaumont  1980]: 


Theorem  1.1  Let  X\,  X2, . . . ,  Xn  have  joint  probability  density  function 
/(xi,x2,  •  •  •  >  Xn |#i>  #2?  •  •  • >  0k)>  and  suppose  we  want  to  test  the  null  hypothesis  Ho  :  9{  =  9 
against  the  alternative  H\  :  Oj  =  9\7  where  the  9 f  and  9\  are  known  constants  for  each 
i  G  {1,2,... ,  A:},  representative  of  predetermined  possible  states.  Suppose  further  that  U 
is  the  set  of  points  (aq,  £2, . . . ,  xn)  such  that: 
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f{x  1,X2,. 

f(x  1,X2,. 

..,Xn\Ho) 

>C, 


for  some  £  >  0; 


P({Xi,X2,...,Xn)eU\H0)  =  a. 


Then  U  is  a  decision  region  of  size  a  that  has  the  greatest  power 1 . 


The  ratio  of  joint  densities  in  Theorem  1.1  is  a  ratio  of  likelihood  functions,  and  hence 
this  test  is  often  called  a  likelihood  ratio  criterion  test.  This  methodology  can  be  used  to 
construct  a  standard  test  format  in  the  context  of  interest.  [Minkler  and  Minkler  1990]2 
contains  a  detailed  description  of  this,  to  which  the  reader  is  referred. 

In  the  case  of  detection  of  a  single  target  observation,  within  a  series  of  clutter  and 
noise  observations,  in  a  single  radar  return,  we  are  interested  in  testing  the  hypothesis: 
Ho  :  r  =  j]  against  the  alternative  H\  :  r  =  v  +  77,  where  r(t)  denotes  the  radar  signal 
return,  over  a  time  observational  window  [0,  ?/>]  ,  v  —  v(t)  is  a  target  observation  and 
77  =  Tg(t)  is  clutter  and  noise.  It  will  be  necessary  to  assume  some  statistical  properties  of 
the  clutter  and  noise  are  known,  such  as  their  distributions  are  at  least  partially  known, 
and  distributional  parameters  can  be  estimated  from  radar  returns.  This  is  in  particular 
very  important  in  the  context  of  determining  the  adaptive  threshold.  There  are  a  number 
of  models  that  can  be  used  to  simulate  clutter.  In  the  work  to  follow,  we  will  focus  on 
the  simple  case  where  the  clutter  and  noise  are  modelled  as  Gaussian.  This  restriction  is 
applied  because  it  is  possible  to  find,  in  this  case,  an  analytical  relationship  between  the 
false  alarm  probability  and  threshold  parameter,  for  a  given  signal  to  noise  ratio.  This 
allows  easy  performance  analysis  of  Monte  Carlo  estimators.  The  Importance  Sampling 
techniques  considered  here  readily  apply  in  the  case  of  non-Gaussian  noise  and  clutter 
models.  The  difficulty  is  that  it  can  be  hard  to  derive  an  approximate  analytical  solution 
for  comparison. 

Returning  to  the  hypothesis  test  context,  suppose  we  have  a  sample  of  m  clutter  and 
noise  observations  Xi,  X2,  • . . ,  Xm,  and  based  upon  these  we  construct  a  test  statistic 

Y  =  Y(X\,  X2, . . . ,  Xm).  We  assume  we  have  a  test  observation  Ao,  which  is  either  a 
target  or  noise/clutter  observation.  This  observation  is  referred  to  as  the  statistic  of  the 
cell  under  test  (CUT).  Using  the  approach  of  Theorem  1.1,  it  can  be  shown  that  the  form  of 
the  best  test  is  to  reject  Ho  if  X$  >  tY ,  where  r  is  a  constant.  The  physical  interpretation 
of  this  rejection  criterion  is  that  we  compare  the  observation  under  test  to  a  normalised 
test  statistic,  the  latter  being  based  upon  the  clutter  and  noise  observations.  The  statistic 

Y  gives  an  estimate  of  the  magnitude  of  the  clutter.  I11  the  present  context  we  will  only  be 
interested  in  the  case  where  Y  is  a  mean  or  average  value,  so  that  it  gives  a  mean  estimate 
of  the  noise  and  clutter.  Figure3  1  contains  an  illustration  of  this  CA-CFAR  process.  As 
Figure  1  illustrates,  a  series  of  radar  returns  are  passed  to  a  square  law  detector.  Although 
not  clearly  illustrated  in  Figure  1,  the  CA-CFAR  process  actually  has  a  number  of  buffer 
cells  separating  the  CUT  and  the  observational  cells. 

Recall  that  the  power  of  a  statistical  hypothesis  test  is  the  probability  of  rejecting  the  null  hypothesis 
when  the  alternative  is  actually  true.  Thus  it  statistically  measures  the  likelihood  of  making  a  correct 
decision. 

2See  Chapter  3  of  this  book  for  a  full  mathematical  treatment  of  the  classical  detection  problem. 

3  All  figures  and  plots  have  been  placed  in  the  appendix  at  the  end  of  the  report. 
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The  processed  returns  are  then  averaged,  normalised  and  compared  to  the  observation 
under  test.  It  then  outputs  a  detection  decision.  A  radar  detector  divides  a  scan  region  up 
into  a  number  of  small  sets  of  observational  cells,  and  this  process  would  be  run  a  number 
of  times  over  a  scan  region  to  determine  the  false  alarm  probability. 

We  now  introduce  the  basic  ideas  in  Monte  Carlo  estimation  of  probabilities  and  integrals. 


1.3  Monte  Carlo  Fundamentals 


Monte  Carlo  Methods  use  simulation  techniques  to  estimate  integrals.  An  integral  can  be 
estimated  via  a  summation  of  functions  evaluated  at  randomly  generated  numbers.  Since 
a  probability  or  an  expectation  can  be  written  as  an  integral,  the  techniques  can  be  applied 
in  this  context  too. 

Monte  Carlo  estimation  is  based  upon  the  Strong  Law  of  Large  Numbers  (SLLN): 


Theorem  1.2  Suppose  771, 772, . . . ,  r]m  is  a  sequence  of  independent  and  identically  distrib¬ 
uted  random  variables  with  finite  mean  E[rj\.  Then 

m 

L  'h 

j  —  1 

lim  - =  E\ti]  almost  surely.  (1) 

m—>oo  m 

In  a  probability  context,  the  phrase  almost  surely  (a.s.)  means  that  the  result  holds  except 
011  a  set  of  (probability)  measure  zero.  Thus,  for  sufficiently  large  m,  the  average  of  the 
random  variables  in  (1)  can  be  approximated  by  the  mean  value.  To  clarify  this,  consider 
the  integral  /  =  f^lw(x)f(x)dx^  where  /  is  a  density  on  fi,  and  w  is  a  deterministic 
function.  Here  Q  may  be  a  vector  space,  so  that  x  may  be  a  vector.  Hence  integral  I  is 
an  expectation:  suppose  77  is  a  random  variable  on  Q  with  density  /.  Then  we  can  write 
I  =  E[w(rj)].  Now  suppose  771, 772, . . . ,  77m  is  an  independent  and  identically  distributed 
(IID)  sequence  of  random  variables.  Then  the  SLLN  (1)  implies  that 

m 

lim  — - =  E[w(tj)]  =  I,  a.s.  (2) 

7TI-+OQ  m 

Thus  it  follows  that  we  have  the  approximation 

m 

/  =  f  w(x)f(x)dx  — - .  (3) 

Jn  rn 

where  the  sequence  z\,  22, . . . ,  zm  consists  of  realisations  of  the  variables  771, 772, . . . ,  r]m. 

The  approximation  in  (3)  is  called  a  Monte  Carlo  estimate.  This  can  be  used  to  estimate 
integrals  in  a  number  of  different  contexts.  This  method  can  be  used  to  easily  estimate 
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analytically  difficult  integrals.  As  an  example,  consider  the  integral  I  =  f£°  e  xldx.  Since 
we  can  write  the  integrand  as  e~x2+xe~x,  and  the  function  f(x)  =  e~x  is  a  density  on 

Q  =  [0,  oo),  it  follows  that  the  integral  can  be  estimated  by  — — e-[l°s(rj)]  where 

m  “  r7- 

j=i  J 

for  each  j,  rj  are  realisations  of  a  uniform  random  variable4  on  the  unit  interval  [0, 1].  A 
simulation  of  1000  results  gave  I  «  0.8710  while  a  sample  of  size  10,000  yielded  I  ~  0.8854. 
Almost  a  quarter  of  a  million  simulations  produced  the  estimate  /  «  0.8865.  This  compares 
to  the  exact  value  of  I  =  0.5^  «  0.8862.  While  it  may  take  a  large  number  of  simulations 
to  produce  an  accurate  result,  the  technique  is  quite  easy  to  implement,  and  can  be  used 
to  estimate  analytically  intractable  integrals. 


The  Monte  Carlo  estimate  in  (3)  can  also  be  used  to  estimate  probabilities,  with  the  choice 
of  w(x)  =  I[x  G  A],  where  /  is  an  indicator  function,  defined  by 


rr  ^  ai  J  1  if  x  6  A; 

1  ^  |  0  otherwise. 

Let  X  be  a  random  variable  on  Q  with  density  /.  Suppose  we  want  to  estimate  the 
probability  P(X  G  A),  for  some  set  A.  Then  an  application  of  (3)  yields 


P{X  G  A)  =  /  I[x  G  A]f(x)dx 

Jn 

m 

Y  /t:cJ  6  A\ 

j= 1 

rv  _ _ 

m 


(4) 


where  the  sample  :ri,X2, . . .  ,xm  is  IID  generated  from  X.  To  illustrate,  consider  the  case 
where  X  is  an  exponential  random  variable  with  parameter  1,  so  f(x)  =  e~x,x  >  0. 
Suppose  we  want  to  estimate  P(X  <  1).  It  is  not  difficult  to  show  that  P(X  <  1)  = 
1  —  e~l  «  0.6321.  Now  A  =  [0,1]  and  f]  =  [0,  oo),  so  a  Monte  Carlo  estimator  of  this 
probability  is  pmci  =  ^  YljL l  I[xj  ^  1]?  where  the  xj  are  generated  from  the  exponential 
distribution.  However,  there  is  an  alternative  estimator,  since  P(X  <  1)  =  f0l  e~xldx 
and  g(x)  =  1  is  a  density  on  the  unit  interval  [0,1].  Thus  an  alternative  estimator  is 
PMC2  —  ~  Ejii  e~Xj ,  where  the  Xj  are  random  numbers  in  the  interval  [0, 1]. 


A  difficulty  with  the  Monte  Carlo  approach  is  that  it  can  take  a  considerable  number  of 
simulations  to  achieve  a  reasonable  level  of  accuracy.  To  illustrate  this,  recall  that  the 
Central  Limit  Theorem  (CLT)  states  that  for  a  series  {r/j}  of  IID  random  variables  with 
finite  first  and  second  moments, 

Jim  £  nq  1}  (5) 

y/m 

where  p  =  E[p]  and  a2  =  V[rj\.  Using  a  moment  generating  function  expansion  of  the 
sum  of  random  variables  in  (5),  it  can  be  shown  that  the  rate  of  convergence  is  roughly 

4This  is  because  an  exponential  random  variable  with  parameter  1  can  be  simulated  by  —log(r),  with 
r  a  random  number  between  0  and  1. 
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of  the  order  Hence,  for  an  accuracy  of  10  6,  the  number  of  Monte  Carlo  runs  can  be 

roughly  of  the  order  of  1012.  This  can  result  in  considerably  long  run  times.  To  counteract 
this,  the  techniques  of  Importance  Sampling  (IS)  are  used.  This  involves  generating  the 
random  variables  in  (4)  from  a  biasing  distribution,  /*.  The  latter  is  chosen  to  emphasize 
points  in  the  estimation  that  are  more  important  to  the  simulation.  The  effect  of  this 
is  that  the  simulation  is  using  less  points  than  the  standard  Monte  Carlo  estimator.  To 
balance  the  estimation,  and  to  produce  an  unbiased  estimate,  the  estimator  is  weighted 
at  each  point.  Thus,  an  IS  estimate  of  the  probability  in  (4)  is  given  by 


pis  =  -Y.i\z^a\w^ 

til  . 
j=l 


(6) 


where  the  zj  are  generated  from  a  distribution  with  biasing  density  /*,  and  W  is  a  weighting 
function.  To  produce  an  unbiased  estimator,  it  can  be  shown  that  W(z)  =  As  an 

example,  consider  the  problem  again  of  estimating  P(X  <  1)  where  X  is  an  exponential 
random  variable  with  parameter  1.  Choose  as  biasing  distribution  a  uniform  random 
variable  on  the  interval  [0, 1]  .  Now  f(z)  =  e~z  and  f*(z)  =  1  on  the  unit  interval,  and 
so  the  IS  estimator  is  pis  =  T  YljL i  e~Zj  •  This  is  the  same  as  the  second  Monte  Carlo 
estimator  p\ic2  considered  previously. 


Before  considering  IS  estimation  in  more  detail,  we  firstly  derive  an  analytical  expression 
for  the  probability  of  false  alarm  in  a  CA-CFAR,  in  the  case  of  Gaussian  clutter  and  noise. 
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2  A  Cell  Averaging  CFAR  Model 


Ill  this  section  we  will  construct  a  model  for  a  CA-CFAR  detector  operating  in  the  presence 
of  Gaussian  clutter  and  noise.  A  useful  relationship  between  the  CA-CFAR  probability  of 
false  alarm,  and  the  threshold  multiplier  will  be  derived.  The  latter  will  provide  a  useful 
mechanism  for  testing  the  efficiency  and  accuracy  of  Monte  Carlo  estimators  considered 
in  this  report. 


2.1  From  Signal  Return  to  Envelope  Detection  Output 

To  give  a  context  to  the  analysis  to  follow,  we  briefly  consider  the  radar  signal  return 
and  its  characteristics  under  some  assumptions.  [Levanon  1988]  and  [Minkler  and  Minkler 
1990]  both  contain  good  discussions  on  the  processing  of  the  signal  return  before  it  is 
passed  into  the  CFAR  detector.  In  a  simulation  of  a  CA-CFAR  system,  one  can  generate 
results  from  the  point  of  view  of  simulating  the  process  after  square  law  detection,  since 
the  statistics  introduced  previously  were  defined  in  this  part  of  the  CFAR  model. 

A  case  in  which  the  threshold  parameter  r  can  be  obtained  analytically  in  terms  of  the 
false  alarm  probability,  is  when  the  clutter/noise  is  assumed  to  be  Gaussian.  In  this 
case,  the  resulting  random  variables  from  the  square  law  detector  are  exponential.  We 
now  demonstrate  this,  based  upon  the  approach  of  [Levanon  1988],  and  also  derive  an 
expression  for  the  probability  of  false  alarm  as  a  function  of  the  threshold  parameter. 

The  following  is  for  single  pulse  detection.  The  transmitted  signal  is  a  sine  wave  of  duration 
and  frequency  u>c-  The  returned  signal  will  be  a  phased  shifted  version  of  the  originally 
transmitted  signal,  except  with  the  addition  of  noise  and  clutter.  The  radar  return  is 
passed  to  a  narrow  bandpass  filter,  with  centre  frequency  ujc •  We  assume  this  filter  has  a 
rectangular  response  with  bandwidth  /#.  Then  assuming  that  /#  >  the  returned  pure 
signal  can  be  described  by 

so(t)  =  Acos(u)c(t)  —  (f>s)  =  acos(uct)  +  6sin(u;ct),  (7) 

where  (j>s  =  arctan  is  the  phase  shift  of  the  signal,  and  the  amplitude  A  =  s/a2  +  b 2. 
When  Gaussian  noise  is  passed  through  a  narrow  bandpass  filter,  the  output  can  be  written 
as 

n0(t)  =  X(t)  cosset)  +  y(i)sin(u;c£),  (8) 

where  X(t)  and  Y(t)  are  both  independent  Gaussian  random  variables  with  mean  0  and 
equal  variance,  a2.  By  combining  (7)  and  (8),  the  radar  signal  return  at  the  detector  can 
then  be  written  as 


C(t)  =  s0(t)  +  no(t) 

=  (a  +  X ( t ))  cos (uJct)  +  {b  +  Y ( t ))  sin (u>ct)  (9) 

=  R(t)  cos (uct  +  $(£))> 
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where 


R(t)  =  \J(a  +  X(t))2  +  (b  +  Y ( t ))2  and  <£(£)  =  arctan 


b  +  Y{t)  1 
a  +  X(t). 


(10) 


It  is  desirable  to  obtain  an  understanding  of  the  distribution  of  the  amplitude  R(t)  in 
equation  (10),  since  amplitude  characteristics  of  return  signals  are  used  in  the  detection 
of  targets.  Note  that  the  square  of  R(t)  is  a  sum  of  the  squares  of  two  Gaussian  random 
variables:  let  X\  =  a  +  X(t)  and  Y\  =  b+  Y(t).  Using  the  joint  distribution  of  (X\y  Yi), 
and  changing  to  polar  coordinates  (/?,  <J>)  defined  through  the  transformations  implied  by 
(10),  it  can  be  easily  shown  that  the  transformed  joint  density  is 


/(R,$)(r>  4) 


wexp 


r2  +  a2  +  b2  —  2  ra  cos  0  —  2rb  sin  <j> 

2^ 


(11) 


where  a2  is  the  common  variance  of  A" i  and  Y\.  To  obtain  the  probability  density  function 
(pdf)  of  the  amplitude  R(t),  we  integrate  the  density  (11)  over  all  phases  to  obtain 


/fl(r) 


r‘2n 

J  /(H,<t>)(r)  4>) 


r 

r2  +  A2 

rA 

—9  exp 

C 7Z 

2  a2 

lo 

.  v2 . 

(12) 


where  Iq  is  the  modified  Bessel  function  of  order  zero0.  Note  from  (12)  that  in  the  case  of 
no  signal  present  in  the  return,  so  that  >1  =  0,  the  pdf  of  the  amplitude  of  the  return  is 
Rayleigh  with  parameter  a. 

The  amplitude  is  then  passed  through  a  square  law  detector,  so  we  are  interested  in  the 
resulting  distribution  from  squaring  R.  Consider  the  normalised  variable  Z  =  R2.  It  can 
be  shown6  that  its  pdf  is  given  by 


fz{z)  = 


1 

2<t2  (l  + 


exp 


z 

w  (i  +  Wj 


(13) 


Hence  the  radar  returns,  processed  through  the  square  law  detector,  are  exponentially 
distributed.  The  ratio  ^  :=  S  is  a  measure  of  the  signal  to  noise  strength.  Thus, 
based  upon  (13),  we  can  simulate  the  CFAR  system  with  Gaussian  noise  and  clutter, 
by  simulating  exponentially  distributed  variables.  In  the  case  of  no  target  in  the  CUT, 
the  statistic  for  it  will  have  an  exponential  distribution  with  parameter  —j,  since  the 
amplitude  >1  =  0.  When  there  is  a  target  in  the  CUT,  this  distribution  is  the  same,  except 
the  parameter  becomes  2<r*{i+s)>  ^le  Parame^er  &  controlling  the  significance  of  the 
target  return. 


In  the  next  section  we  use  this  result  to  formulate  a  statistical  test  for  the  presence  of  a 
target  in  Gaussian  noise. 

5See  [Levanon  1988]  for  an  integral  expression  for  this.  This  integral  depends  on  the  initial  phase  (j>s . 

6See  [Levanon  1988],  problem  3.1  for  the  details. 
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2.2  Exponentially  Distributed  Post-Envelope  Returns 


To  illustrate  the  CFAR  detection  scheme  for  a  single  pulse,  we  consider  the  simple  case 
formulated  in  [Gandhi  and  Kassam  1994].  In  the  latter,  it  is  shown  that  the  CA-CFAR 
detector  is  optimal  for  the  detection  of  Swerling  1  type  targets.  This  is  in  the  case  where 
Gaussian  noise  is  passed  into  a  square  law  envelope  detector,  equivalent  to  homogeneous 
exponential  clutter.  The  set  up  of  the  previous  Subsection  provides  the  framework  for  this 
approach.  Here  we  suppose  the  radar  signal  returns,  already  processed  by  a  square  law 
detector,  are  IID  exponential  random  variables,  motivated  from  the  above  analysis. 

Suppose  X\,  X*2, . . . ,  Xm  is  a  sample  of  such  a  return  of  pure  clutter/noise  observations, 
so  that  each  Xj  has  an  exponential  distribution,  with  parameter  jj.  This  implies  the  mean 
of  this  distribution  is  p.  With  reference  to  (13),  and  the  comments  proceeding  (12),  it 
follows  that  p  =  2er2  (since  A  =  0).  The  latter  is  the  mean  noise  plus  clutter  power.  The 
CUT  statistic  Xo  is  either  noise  and  clutter,  and  so  is  identically  distributed  to  the  sample 
above,  or  a  radar  target  return  in  the  presence  of  noise  and  clutter.  We  assume  in  the 
latter  case  that  Xo  has  an  exponential  distribution  with  parameter  where  S  is  the 

signal  to  noise  ratio  of  a  Swerling  1  target,  as  defined  previously,  and  motivated  from  (13). 


Hence,  to  test  for  the  presence  of  a  target,  we  perform  a  test  of  Ho  :  p  —  p  against 
the  alternative  H\  :  p  =  p{\  +  5),  assuming  the  test  observation  Xo  has  an  exponential 
distribution  with  parameter  The  form  of  the  test  is  to  reject  Hq  if 


Xo  r 

YflLiXi  ™ 


where  r  is  a  threshold  multiplier.  Thus  the  adaptive  threshold  for  this  CA-CFAR  scheme 
is  ^  Y^JLi  xj->  based  upon  a  realised  sample  #1,2:2, . . .  ,  #m.  The  probability  of  false  alarm 
Pfa  can  be  determined  through 


Pfa  —  P(rejecti/o|#o  is  true) 


=  p 


m  r- 1 


Ho  is  true 


(14) 


Under  Ho ,  Xo  has  an  exponential  distribution  with  parameter  j-r  The  sum  Xj  of 

similarly  defined  independent  exponential  random  variables  has  a  Gamma  Distribution7 
7  (ra,  Such  a  distribution  has  density 


fz{z) 


/xr(m)  V/V 


(15) 


where  T  is  the  Gamma  function,  defined  as  the  integral 

roc 

T(m)  =  /  xm~le~xdx  =  (m  —  1)!, 

Jo 

7See  [Beaumont  1980],  Appendix  1,  noting  that  an  exponential  variable  is  also  Gamma. 
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where  the  latter  equality  holds  only  if  m  is  a  nonnegative  integer.  Here  the  random  variable 
Z  =  i  Xj.  Hence,  applying  (15)  to  (14),  we  obtain 


(16) 


Thus,  for  a  given  threshold  multiplier  r  and  number  of  cells  m,  the  probability  of  false 
alarm  can  be  obtained  exactly,  as  given  by  (16).  This  result  provides  a  means  of  testing 
the  performance  of  Monte  Carlo  estimators  for  both  the  probability  of  false  alarm  and  the 
threshold,  in  the  case  of  Gaussian  clutter  and  noise. 

From  the  CA-CFAR  detector  point  of  view,  a  target  in  the  CUT  is  declared  present  if 


It  is  also  interesting  to  note  that  the  probability  of  detection,  Pd,  can  be  found  in  a  similar 
manner.  Noting  that  this  is  the  probability  that  Hq  is  rejected  when  H\  is  true,  it  follows 
that 


1 


(17) 


Observe  that  both  the  probability  of  false  alarm  and  detection  are  independent  of  the 
mean  clutter  and  noise  parameter  [i.  The  probability  of  detection  (17)  has  been  included 
for  completeness.  We  will  not  investigate  it  further  in  this  report. 


We  now  turn  to  Monte  Carlo  techniques  for  the  estimation  of  false  alarm  probabilities  in 
CA-CFAR  detectors. 
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3  Estimation  of  False  Alarm  Probabilities 


The  purpose  of  this  Section  is  to  illustrate  the  application  of  Monte  Carlo  methods  to  the 
estimation  of  false  alarm  probabilities,  in  a  CA-CFAR  context.  The  problem  is  formulated 
as  a  statistical  test,  and  standard  Monte  Carlo  and  Importance  Sampling  estimators  are 
introduced.  Two  biasing  densities  are  then  considered,  as  well  as  simulation  gain  analysis. 
Finally,  an  alternative  method  of  performing  IS  simulations  is  introduced.  Simulation 
studies  of  the  performance  of  these  estimators  will  be  the  subject  of  the  final  Section. 


3.1  Standard  Monte  Carlo  Methods 


Suppose  we  have  m  observational  cells,  with  observational  cell  statistics  x\ ,x%,  -  -  -  ,  xm. 
Let  xo  be  the  statistic  for  the  cell  under  test  (CUT).  For  ease  of  notation,  let  x  = 
(a?o,  xi, . . . ,  xm).  Additionally,  let  To  =  We  perform  a  test  of  Hq :  no  target  in  the 
CUT,  against  the  alternative  H\\  there  is  a  target  in  the  CUT.  The  test  statistic  is 
D(x)  =  xo  —  To  jyj1- 1  Xj.  Based  upon  the  analysis  in  Subsection  1.2,  the  form  of  the  test  is 
to  reject  Ho  if  D{x )  >  0.  The  region  where  we  reject  Hq  is  called  the  critical  region.  The 
CA-CFAR  adaptive  threshold  is  7o5I/=iXj,  with  To  called  a  threshold  multiplier.  The 
threshold  multiplier  is  used  to  control  the  false  alarm  probability,  so  that  the  rate  of  false 
alarms  is  constant  in  the  CFAR  process. 


The  probability  of  false  alarm  Pfa  is  the  probability  of  declaring  a  target  present  in  the 
CUT,  when  there  is  actually  no  target  present.  Mathematically,  this  can  be  written 

PFA  =  P(  reject  Hq\  Hq  is  true  )  =  P(D(X)  >  0|  Ho  is  true  ), 


where  X  =  (Xo,  Xi, . . . ,  Xm).  Assume  X  has  density  /  on  Q  under  Ho-  Typically,  for 
the  purposes  of  modelling,  we  assume  the  observational  cell  statistics  are  independent 
and  identically  distributed.  Hence,  under  Ho,  the  marginal  statistics  in  vector  X  are  IID 
clutter  statistics.  Thus  the  probability  of  false  alarm  can  be  written 


PFA  =  [  I[D(x )  >  0 }f(x)dx 

J  n 

=  /  p(x)f(x)dx, 

Jq 


(18) 


where 


n(x)  =  I[D(x )  >  0]  = 


( 


1 

0 


if  D(x)  >  0; 
otherwise. 


and  is  called  an  indicator  function.  Hence,  applying  the  Strong  Law  of  Large  Numbers 
(1),  and  using  the  approximation  (3),  a  Monte  Carlo  estimate  of  the  probability  of  false 
alarm8 


N 

pmc = y. 

3= 1 


P{Xj) 

N  ’ 


(19) 


8Throughoujt,  a  hat  over  a  variable  will  indicate  that  it  is  an  estimate  or  estimator.  Thus,  x  is  an 
estimate,  and  X  is  an  estimator. 
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where  Xj  =  (;Tq,  x\,  . . . ,  xJm)  is  generated  from  a  distribution  with  density  /.  Here,  the 
notation  x[  refers  to  the  ith  element  of  the  j th  simulation  vector.  N  is  the  number  of 
simulat ions/ recursions  in  the  Monte  Carlo  estimate.  The  estimator  version  of  this  is 

PZc  =  £  (20) 

j  =  1 

with  Xj  a  random  vector  version  of  Xj.  It  is  not  difficult  to  show  that  P\\c  is  an  unbiased 
estimator  of  Pfa,  with  variance  that  decreases  to  zero  as  the  sample  size  increases: 

E[P^c]  =  Pfa  and  V[p£c)  =  PpA^  ~  PpA\  (21) 

As  pointed  out  previously,  the  difficulty  with  this  estimator  is  its  rate  of  convergence. 
Thus  we  look  at  alternative  ways  of  estimating  false  alarm  probabilities. 


3.2  Importance  Sampling  Estimators 


From  a  purely  mechanical  perspective,  the  idea  of  importance  sampling  is  to  generate  the 
sample  vectors  Xj  in  the  estimate  (19)  from  a  different  density.  In  a  simulation,  certain 
values  of  the  input  variables  will  have  more  importance  to  the  estimation  than  others. 
In  IS,  one  chooses  a  new  density  to  emphasize  these  important  points.  Since  one  is  then 
sampling  from  a  smaller  population,  it  is  possible  to  gain  a  reduction  in  variance. 


To  clarify,  consider  the  problem  of  integral  estimation  via  Monte  Carlo  methods,  as  out¬ 
lined  in  Section  1.3.  In  many  cases,  a  function  of  interest  has  most  of  its  mass  located 
within  a  certain  region,  such  as  the  function  f(x)  =  e~x  ,  for  x  G  R.  In  this  particular 
case,  the  function  is  bell  shaped  with  most  of  its  mass  lying  within  an  interval  symmetric 
about  the  origin.  Importance  sampling  attempts  to  increase  the  density  of  points  within 
these  important  regions. 

The  effect  on  the  estimator  (20),  when  simulating  from  a  different  density,  is  that  it  is  no 
longer  an  unbiased  estimator  for  the  probability  of  false  alarm.  To  address  this  issue,  each 
sample  point  in  the  series  of  (20)  is  weighted  to  produce  a  centered  estimate.  The  new 
distribution  used  for  simulation  is  called  a  biasing  distribution ,  and  its  density  is  referred 
to  as  a  biasing  density.  We  now  turn  to  a  mathematical  treatment  of  these  ideas. 


Suppose  we  sample  from  a  biasing  distribution  with  density  /*.  Then  the  modified  esti¬ 
mator  for  probability  of  false  alarm  is 


M 


Pis  =  Y. 

j  =  1 


n{Zj)W{Zi) 

M 


(22) 


where  Zj  =  {Z&,  Z{,...,  Z]m)  is  generated  from  a  distribution  with  density  /*,  and  W(z)  is 
a  weighting  function.  In  order  to  produce  an  unbiased  estimator  for  Pfa,  if  can  be  shown 
that 


W{z)  = 


M 

/*(*)’ 
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With  this  choice,  the  variance  of  estimator  (22)  is 


in  f*{x) 

Observe  that  in  the  case  where  the  biasing  density  is 

/♦Or)  =  Pp\n(x)f(x), 


(23) 


(24) 


the  variance  in  (23)  is  reduced  to  zero.  This  is  known  as  the  optimal  solution.  It  is,  however, 
not  implementable  because  it  depends  on  the  unknown  probability  of  false  alarm.  However, 
its  form  indicates  the  characteristics  of  a  good  suboptimal  biasing  density.  It  suggests  that 
a  suboptimal  biasing  density  should  be  proportional  to  the  original  density  /.  Additionally, 
all  of  its  mass  is  concentrated  on  the  critical  region.  A  function  constructed  according  to 
these  specifications  must  then  be  weighted  by  a  constant  to  produce  a  probability  density. 


Using  these  considerations,  a  number  of  authors  have  produced  suboptimal  biasing  den¬ 
sities  [Gerlach  1999,  Orsak  1993  and  Orsak  and  Aazhang  1989].  Two  such  densities  are 
now  considered,  in  the  context  of  the  CA-CFAR  model  we  are  examining. 


3.2.1  Ad  Hoc  Biasing  Density 


A  simple  and  widely  used  biasing  density  is  the  so-called  Ad  Hoc  or  variance  scaled  density. 
It  is  often  used  because  of  its  simplicity,  but  does  not  always  result  in  consistently  improved 
results.  Scaling  the  input  vector  of  samples  by  a  constant  produces  the  biasing  density. 
Although  not  attributed  to  a  single  author,  it  is  discussed  in  a  number  of  publications 
[Gerlach  1999,  Smith,  Shaft  and  Gao  1997,  and  Srinivasan  2000].  In  view  of  Section  2, 
assume  the  clutter  observations  are  IID  exponential  with  parameter  for  some  fixed  p. 
This  means  the  clutter  mean  value  is  also  p.  Hence,  the  joint  density  of  (Xo,  X\, . . . ,  Xm) 
under  Ho  is 

m  1  1  v-'m 

/(*)  =  n  Mxi)  =  — (25) 

j= o  ^ 

The  biasing  density,  for  some  a  <  1,  is 

nr  /  \ 

/*(;r)  =  am+1/(ai)  =  II  (”)  e~»Xj-  (26) 

j=  o  v/t/ 


Hence  the  biasing  distribution  has  exponential  marginals,  with  parameter  due  to  the  fact 
that  it  is  a  product  of  independent  exponential  random  variable  densities.  The  weighting 
function  is 


W(x) 


/(*) 

/»(*) 


(27) 


Now  it  can  be  shown,  by  applying  (25)  and  (26)  to  (23),  that  for  this  Ad  Hoc  biasing 
density, 


V*(Pis)  = 


M 


1 


1  m+1 


a(2  —  a). 


Pfa  -  P\ 


FA 


(28) 
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Since  a  <  1,  MV,{Pts)  >  PFA  -  P*A  =  NV(PMC ),  so  that  for  the  same  number  of 
simulations,  the  IS  estimator  has  larger  variance  than  the  standard  MC  estimator. 


The  simulation  gain  T  gives  a  measure  of  the  number  of  standard  Monte  Carlo  simulations 
that  are  required  to  perform  with  the  same  level  of  variation  as  an  IS  estimator.  We  are 
thus  looking  for  cases  where  the  gain  is  large,  since  this  will  imply  the  IS  estimator  will 
perform,  with  the  same  level  of  accuracy  as  a  standard  Monte  Carlo  estimator,  for  less 
simulation  runs.  Mathematically,  the  gain  is  obtained  by  equating  the  variance  expressions 
in  (21)  and  (23),  and  solving  for  the  ratio  of  simulation  runs,  namely  -£j.  For  the  same 
level  of  variance,  the  simulation  gain  in  the  current  context  is 


r  = 


N_ 

M 


1  -  Pfa 


(29) 


and  since  0  <  a  <  1,  it  follows  that  F  <  1.  Thus,  for  the  same  level  of  variation,  the  IS 
Ad  Hoc  approach  does  not  reduce  the  number  of  simulation  runs.  This  does  not  mean 
that  this  approach  is  entirely  useless.  In  some  cases,  an  Ad  Hoc  biasing  density  has  been 
found  to  provide  improvements  in  estimation,  when  compared  to  the  standard  Monte  Carlo 
estimator. 


3.2.2  Chernoff  Biasing  Density 


The  Chernoff  IS  method  was  introduced  in  [Gerlach  1999].  In  this  case,  the  indicator 
function  in  the  optimal  solution  (24)  is  replaced  by  an  exponential  function  of  the  detec¬ 
tion  statistic.  It  then  turns  out  that  the  simulation  gain  is  inversely  proportional  to  the 
simulation  reduction  factor.  The  latter  is  referred  to  as  a  Chemoff-like  bound  [see  Van 
Trees  1971].  In  some  cases,  this  can  result  in  quite  substantial  simulation  savings. 

Observe  that,  for  a  fixed  A  >  0,  eXDW  >  /x(x).  Thus  we  introduce  a  biasing  density 

/.(*)  =  -±—f{x)eXD(*\  (30) 


for  some  A  >  0,  such  that  PcW  —  /p  f(x)eXD^dx  <  oo.  It  is  not  difficult  to  see  that 
PFA  <  Pc{ A)-  Hence  we  choose  a  A  to  minimise  Pc(\).  Assuming  the  same  clutter  model 
(25),  and  using  the  fact  that  D(x)  =  xq  —  tq  1  xj  ■  it  can  be  shown  that 


1 

M A) 


=  H 


ra+1 


i 


1  . 

-  +  Ar0 

P 


(31) 


and 


e"[i+Hxi. 


(32) 


Hence  the  biasing  distribution  of  X  =  (Xo,  Xi, . . . ,  Xm)  has  IID  marginals,  with  Xo  having 
an  exponential  distribution  with  parameter  ^  —  A,  and  each  Xj  having  an  exponential 
distribution  also,  but  with  parameter  jl  +  Atq,  for  j  E  {1,2,...,  m}. 
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The  weight  function  for  the  IS  estimator  can  be  shown  to  be 

eA  [-ZO+TO  I™  1  *j] 

W{X)  =  [1  —  /zA][l  +  /xAro]m 
By  applying  (30)  to  (23),  it  can  be  shown  that 

V.(PTs)  <  [PfaPc( A)  -  P2fa]  , 

and  so  for  the  same  level  of  variance  as  the  standard  MC  estimator,  we  have 


N  " 

M  ~  Pc( A) 


(33) 


(34) 


Tims,  depending  on  whether  PcW  is  sufficiently  small  or  not,  there  is  a  possibility  of 
improved  performance  with  this  IS  biasing  density.  [Gerlach  1999]  points  out  that,  in 
some  cases,  Pc(A)  is  within  a  few  orders  of  magnitude  of  Pfa,  and  so  the  simulation  gain 
(34)  can  be  large.  We  now  demonstrate  this  for  the  particular  case  of  CA-CFAR  under 
consideration. 


In  view  of  (34),  we  want  to  minimise  Pc  (A).  It  can  be  shown  a  local  minimum  exists  and 
occurs  at 

\  mr0  -  1  /ocX 

Kpt  = - ? — — rr-  (35) 

HTo[m  +  1] 

Substituting  (35)  into  the  Chernoff  factor  (31),  we  obtain 


PciKpt)  =  ("5)  +  1)" 


i  +  i'mr 

m 


r0(m  +  1) 
T0  +  1 


(36) 


Observe  that  (36)  is  independent  of  the  clutter  parameter  f.i.  Hence,  with  reference  to 
(16),  we  can  write  (36)  as 


Pc{KPt)  =  Pfa  ^1  -  Pfa)  (m  +  1)  ^1  +  —  ^  ,  (37) 

where  we  have  used  the  fact9  that  tq  =  An  application  of  (37)  to  (34),  together  with 
a  geometric  series  expansion10,  establishes  that  the  simulation  gain  will  be 


Thus  the  gain  can  be  of  the  order  of  the  reciprocal  of  the  probability  of  false  alarm, 
and  hence  can  be  quite  large.  Although  the  gain  (38)  can  provide  substantial  simulation 
savings,  it  turns  out  that  an  alternative  scheme  provides  a  larger  gain  over  all  other  IS 
estimators. 

9See  Section  3.1. 

10Recall  that  l  +  r  +  r2  +  r3  +  ...=  :,  provided  |r|  <  1. 
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3.3  The  G-Function  Method 


We  now  consider  an  alternative  approach  to  Importance  Sampling.  A  number  of  authors 
have  suggested  modification  of  the  form  of  the  IS  estimator  (22).  [Smith  and  Orsak  1995 
and  Srinivasan  2000]  are  examples,  and  here  we  will  focus  on  the  G-Function  method  of 
[Srinivasan  2000].  This  approach  involves  biasing  only  the  observational  cell  statistics, 
and  then  estimating  a  mean  value  of  a  function  of  a  conditional  cumulative  distribution 
function.  Although  this  method  may  seem  somewhat  complicated,  its  appeal  is  that  it 
can  be  shown  that  for  the  same  number  of  simulation  trials,  it  will  always  perform  better 
than  any  other  IS  technique.  [Srinivasan  2000]  contains  a  mathematical  proof  of  this. 

We  begin  by  returning  to  the  basic  expression  for  the  probability  of  false  alarm.  Let 
Y  =  YyjLi  Xj*  Then  observe  that 

PFA  =  P(Xo>t0Y\H0) 


=  1-E(I[Xo<t0Y}\H0)  (39) 

=  1  —  Eo(I[Xo  <  toY]\  Ho,Y), 

where  Eq  is  expectation  under  Hq.  Define  the  function  g(x)  =  1  —  Fxq\Ho,y{x)i  where 
FXo\HoAx)  ls  conditional  cumulative  distribution  function  of  the  statistic  of  the  CUT, 
given  Ho  is  true  and  given  Y.  Hence 

PFA  =  Eolg(T0Y)\, 

suggesting  that  IS  estimation  can  be  performed  by  biasing  Y  and  estimating  the  expecta¬ 
tion  of  g(joY). 


The  corresponding  IS  estimator  is 


p?s  (40) 
.  -/Vi 

J=1 

where  Yj  =  and  (T/, . . . ,  Y7Jn)  is  a  vector  of  length  m  generated  from  a  biasing 

distribution  with  density  /*.  The  weight  function  in  (40)  is  the  same  as  before,  except  it 
is  now  a  function  of  the  observational  cell  statistics  only.  The  function  /  is  also  now  a 
density  of  m  variables,  since  the  CUT  is  not  biased. 


The  variance  of  this  estimator  is  given  by 

V.{P*a)  =  T  [E4g2(ToY)W((Yl,...,Ym))\  -  P2FA 


(41) 


Assume  a  biasing  density  that  involves  a  single  biasing  parameter  9.  As  an  example,  this 
could  be  an  Ad  Hoc  biasing  density,  as  considered  previously.  Let 

Ig{0)  =  E,[g2(T0Y)W((Yl,...,Ym),9)].  (42) 
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In  view  of  (41)  and  (42),  we  want  to  find  a  9  that  minimises  Ig(0).  Hence  want  to  solve  for 
=  0.  This  will  often  be  quite  difficult  to  do  analytically.  Hence,  a  Newton- Raphson 
scheme  can  be  implemented,  so  that  starting  with  an  initial  approximation  9q, 

0j+i  =  0j-C§Y%\,  for  j  €  {1,2,...}.  (43) 

dSitgWj) 

The  parameter  £  is  used  to  speed  up  the  convergence,  but  can  be  set  to  1.  In  practice,  we 
will  need  to  estimate  Ig{9)  and  its  derivatives.  This  can  also  be  done  using  Monte  Carlo 
simulations.  Note  that 

Ig{9)  =  Jij‘2{TOy)W‘2({yl,y2,...,ym),0)f*(y)dy 

=  Jr92(Toy)W((yi,y2,...,ym),0)f(y)dy  (44) 

=  E[g2(T0y)W({yl,y2,...,ym),d)}. 


Hence,  differentiating  the  last  equality  in  (44),  and  changing  variables  in  terms  of  the 
expectation, 

^4(0)  =  E^g2(Toy)^W{{yi,y2,...,ym):0)^ 

=  Jr92(Toy)^W({yi,y2,...,ym,0)^W((yi,...,ym),9)f,{y)dy  (45) 

=  E^g2(TOy)^W((yi,y2,...,ym,e)^W({yi,...,jjm),0)  . 


Similarly,  it  can  be  shown  that 

cf 


dff 2 


4(0)  =  E. 


92(T0y)  !^^pW{{yi,y2,...,ym,0)^  W({yu. . .  ,ym),0)  .  (46) 


Monte  Carlo  techniques  can  be  applied  to  estimate  (45)  and  (46).  Simulations  are  based 
upon 


=  ^^92(royi)^W((y{,y32,...,y3m,9)^W({y{,...,yin),9) 

(47) 

_  1  "2  f  ffi  .  .  'I 

where,  for  each  j,  the  vectors  (yj, . . . ,  y 3m)  and  (^, . . . ,  2 3m)  are  independently  generated 
from  a  distribution  with  the  biasing  density  /*.  These  are  standard  Monte  Carlo  esti¬ 
mates,  even  though  they  are  based  upon  the  biasing  density,  due  to  the  fact  that  they  are 
estimating  an  expectation  with  respect  to  this  density.  In  principle,  IS  techniques  could 
also  be  employed  here,  but  it  makes  the  simulation  much  more  complicated. 
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Thus,  an  optimum  9  is  estimated  via  the  recursion 

0j+i  =  0j-  C-|r^T>  for  je  {1,2,...}.  (48) 

Wh\yi) 

Figure  2  in  the  Appendix  contains  plots  of  successive  recursions  of  the  scheme  (48).  It  is 
for  the  case  of  Gaussian  clutter,  with  square  law  detection.  The  three  cases  considered 
are  m  =  4  and  Pfa  =  10“4,  m  =  5  and  Pfa  —  10-6,  and  m  =  6  and  Pfa  =  10-8.  The 
parameter  £  =  1,  and  the  number  of  recursions  refers  to  the  parameter  j  in  (48).  The 
derivatives  in  (48)  were  estimated  using  (47).  The  plots  show  that  after  approximately  10 
recursions,  a  reasonable  estimate  of  9  is  obtained  in  each  case. 


The  simulation  gain  is 


r  = 


Pfa  ~  Pi 


FA 


m-r$A 

and  as  before,  we  can  estimate  Ig(9 )  via  a  Monte  Carlo  simulation: 


(49) 


Ig(e)  =  ~  Y.  g2{T0wj)W2{(w . .  ,w}m ),  9),  (50) 

*3  j=i 

with  (u;j, . . . ,  wJm )  generated  from  /*. 

Thus  the  G- Function  method  is  cpiite  involved,  but  the  level  of  simulation  improvement 
can  be  quite  considerable,  as  will  be  demonstrated  in  Section  4. 


3.3.1  Example:  CA-CFAR  with  Gaussian  Clutter 


We  now  turn  to  the  CA-CFAR  model  under  investigation,  where  we  assume  the  clutter 
and  noise  is  IID  exponentially  distributed  with  parameter  j-r  Since  we  are  looking  at 
developing  techniques  to  compare  CFAR  schemes  in  known  clutter  conditions,  we  assume 
this  parameter  is  known  and  fixed. 


The  clutter  density  is 


/(*)  =  n  Mxi)  =  7^e 


l  -i ym,Xj 

—  - P  n  Z-^J  =  1  j 


j= 1 


9' 


(51) 


[Srinivasan  2000]  suggests  using  a  single  parameter  Ad  Hoc  biasing  density:  the  marginal 
density  will  be 

1 _ z_ 

—  —e  ,  with  0  <  9  <  1, 

Li9 

and  so  the  joint  density  is 


Hi*) 


™  1  1  V-''  m 

=  n  (52) 

Since  we  are  assuming  IID  clutter  observations,  independent  of  the  CUT,  the  (/-function 
is  simply 

g{x)  =  1  -  FXoIY,h0{x)  =  e~i.  (53) 


19 


DSTO-TR  1624 


The  weight  function  can  be  shown  to  be 

W((y i ,  .  .  • ,  2/m),  9)  =  J  ^=1  .  (54) 


Finally,  the  simulation  gain  is 


PFA  ~  Pp.4 
I  m 


- 

m 

e 

-Pfa 

2(r0  +  l)-i_ 

(55) 


Figure  3  in  the  Appendix  shows  the  simulation  gain  (55),  as  a  function  of  9 ,  in  a  typical 
scenario.  There  is  clearly  a  6  that  will  provide  a  huge  simulation  gain.  With  the  corre¬ 
sponding  choice  for  0,  this  implies  that  for  the  same  level  of  variation,  this  IS  estimator 
will  require  significantly  less  simulation  runs. 


The  next  section  contains  an  analysis  of  some  simulations  of  the  IS  estimators  introduced 
in  this  Section. 
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4  Simulation  Comparison  of  Estimators 

Ill  this  section  we  briefly  examine  four  simulations  of  the  estimators  introduced  in  Section 
3.  Mathematical  analysis  implies  that  the  G-function  estimator  will  have  superior  perfor¬ 
mance  in  the  context  of  interest.  Hence  we  expect  it  to  provide  good  estimates  of  the  false 
alarm  probability  for  less  simulation  recursions. 

Each  G-function  estimate  has  been  produced  with  an  independent  estimate  of  the  biasing 
parameter  9.  In  the  Newton-Raphson  scheme  (48),  £  has  been  set  to  1,  and  the  derivative 
estimates  (47)  have  been  generated  with  100  recursions.  That  is,  K\  =  k*2  =  100. 

Figures  4  7  in  the  Appendix  contain  a  sample  of  simulations  of  the  three  estimators.  Figure 
4  contains  the  result  of  15  simulations  of  each  estimator.  In  each  case,  10  recursions  are 
used  to  generate  the  estimate.  The  Ad  Hoc  density  parameter  was  chosen  to  be  a  =  0.05. 

Also,  m  =  5  and  Pfa  =  10“6,  and  the  latter  is  plotted  on  the  graph  at  each  simulation 
point,  to  provide  a  comparison  of  the  estimates.  The  Ad  Hoc  estimates  are  all  zero,  because 
10  recursions  are  not  sufficient  for  this  estimator.  Simulation  experiments  showed  that  it 
can  take  an  enormous  number  of  simulations  for  this  estimator  to  produce  a  non-zero 
estimate.  The  Chernoff  estimator  performs  quite  well,  especially  given  the  small  recursion 
size  used.  However,  the  G-function  estimator  has  superior  performance. 

Figure  5  is  a  comparison  of  the  estimators  in  the  case  where  m  =  4  and  Pfa  =  10”4.  The 
Ad  Hoc  density  parameter  is  a  =  0.9,  and  again,  10  recursions  are  used  to  generate  the 
estimates.  As  in  the  previous  simulation,  we  see  the  same  behaviour  in  the  estimators. 

Figure  6  is  a  comparison  of  estimators  for  larger  recursions.  In  this  case,  the  number  of 
recursions  used  to  generate  the  estimates  is  100.  In  this  case,  we  set  m  —  5,  Pfa  —  10~3 
and  a  =  0.05.  The  same  behaviour  in  the  estimators  is  repeated. 

The  final  simulation  is  in  Figure  7,  where  m  =  5  and  Pfa  =  10“3.  10  simulation  recursions 
are  used  to  generate  the  estimates,  and  the  Ad  Hoc  estimate  is  not  included.  The  G- 
function  estimator  again  has  the  expected  superior  performance. 

An  interesting  observation  is  that  the  Ad  Hoc  biasing  density  has  performed  poorly  when 
used  in  a  standard  IS  scheme,  but  when  coupled  with  the  G-function  technique,  it  has  had 
superior  performance. 
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5  Conclusions 


This  report  has  analysed  Monte  Carlo  methods  used  for  the  estimation  of  false  alarm 
probabilities  in  a  simple  CA-CFAR  radar  detection  scheme.  Performance  analysis  of  radar 
detection  schemes  requires  the  estimation  of  such  probabilities.  Since  the  false  alarm 
probabilities  are  typically  very  small,  standard  Monte  Carlo  estimators  require  a  very  large 
number  of  runs  to  produce  a  reasonable  estimate.  Hence  methods  that  can  provide  a  high 
level  of  accuracy  on  small  sample  sizes  are  desired.  Importance  Sampling  techniques  permit 
such  estimations  to  be  made.  Three  IS  estimators  were  introduced:  Ad  Hoc,  Chernoff  and 
G-Function  approaches.  Both  the  Chernoff  and  G-Function  methods  perform  extremely 
well.  The  G-Function  method  clearly  has  superior  performance,  and  can  provide  accurate 
estimates  on  very  small  sample  sizes. 

Although  attention  was  restricted  to  the  simple  case  of  Gaussian  noise  and  clutter,  it  is 
important  to  reiterate  that  these  techniques  can  be  applied  in  cases  where  the  clutter  and 
noise  is  non-Gaussian.  The  IS  estimation  techniques  are  also  not  restricted  to  CA-CFAR, 
but  can  be  applied  to  other  CFAR  schemes.  This  will  be  investigated  in  future  work. 
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Appendix  A:  Figures 


Figure  1:  A  standard  CA-CFAR  Detector.  The  input  signal  is  passed  through  the  square 
law  detector ,  and  the  processed  return  is  separated  into  three  components.  The  cell  under 
test  (CUT),  and  two  sets  of  observations  (Test  cells  A  and  B),  from  which  a  chitter  measure 
is  extracted.  A  CFAR  process  is  then  performed  on  these  test  cells ,  with  the  result  passed 
into  a  third  process  that  combined  these  to  produce  an  average  clutter  measure .  In  the  case 
illustrated,  this  is  summing.  The  result  is  then  multiplied  by  a  threshold  constant,  and  this 
is  compared  to  the  CUT  statistic.  The  output  is  the  declaring  of  a  target  present ,  or  not 
present,  in  the  CUT. 
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Convergence  of  Theta 


Figure  2:  Convergence  of  9,  as  estimated  from  the  Newton- Raphson  scheme  (4$),  for  three 
particular  cases.  As  can  be  observed ,  a  reasonable  estimate  is  obtained  after  approximately 
10  recursions. 
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Gain  against  Theta 


Figure  3:  Simulation  gain  for  the  G- Function  estimator,  as  a  function  of  0.  There  is  a  9 
that  will  provide  a  gain  of  the  order  10' ,  which  will  result  in  huge  simulation  savings. 
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Comparison  of  Estimators 
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Figure  4:  A  comparison  of  the  three  IS  estimators  considered  in  this  report.  In  each 
case  15  simulation  runs  are  plotted  for  each  estimator,  with  each  IS  estimator  using  10 
recursions  to  generate  the  estimate.  The  ad  hoc  parameter  was  chosen  to  be  a  =  0.05. 
Additionally,  m  =  5  and  Pfa  =  10“6.  The  exact  false  alarm  probability  is  also  plotted  at 
each  simulation  point,  for  comparison. 
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Figure  5:  Comparison  of  the  three  estimators  in  the  case  where  m  =  4,  Pfa  =  10  4  and 
a  =  0.9.  Each  estimator  uses  10  recursions  to  produce  the  estimate . 
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Figure  6:  A  comparison  of  the  estimators  for  larger  recursions.  Here  m  =  5,  Pfa  =  10  3, 
a  =  0.05  and  the  number  of  simulation  recursions  is  100. 


30 


DSTO-TR-1624 


i.a 


-3 

x  10 


1.6* 


1.4* 


1.2 * 


£ 


0.8* 


0.6* 


0.4* 


0.2* 


Comparison  of  Estimators 

- »- 


O  Chernof 
♦  GFunct 
A  Exact 


0 


A  A  A  2 
♦ 


4 

A  A 


10 


15 


Simulation 


Figure  7:  This  is  a  comparison  of  the  G-function  and  Chemoff  IS  estimators ,  for  the  case 
where  m  =  5  and  P?a  =  10“3,  with  10  recursions  used  to  generate  the  estimates . 
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